RVD2: An ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data
Motivation: Next-generation sequencing technology is increasingly being used for clinical diagnostic tests. Unlike research cell lines, clinical samples are often genomically heterogeneous due to low sample purity or the presence of genetic subpopulations. Therefore, a variant calling algorithm for calling low-frequency polymorphisms in heterogeneous samples is needed. Results: We present a novel variant calling algorithm that uses a hierarchical Bayesian model to estimate allele frequency and call variants in heterogeneous samples. We show that our algorithm improves upon current classifiers and has higher sensitivity and specificity over a wide range of median read depth and minor allele frequency. We apply our model and identify twelve mutations in the PAXP1 gene in a matched clinical breast ductal carcinoma tumor sample, two of which are loss-of-heterozygosity events.
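The hierarchical Bayesian approach above can be illustrated, in heavily simplified form, with a Beta-Binomial posterior over the allele fraction at a single locus. This is a minimal sketch, not the RVD2 model itself (which shares strength across positions and compares against a control sample); the function name, flat prior, and 1% calling threshold are assumptions for illustration only.

```python
import random

def beta_binomial_posterior_call(alt_reads, depth, a0=1.0, b0=1.0,
                                 threshold=0.01, n_samples=20000, seed=0):
    """Toy variant call at one locus: Beta(a0, b0) prior on the minor
    allele fraction theta, Binomial likelihood for the read counts, and
    a Monte-Carlo estimate of P(theta > threshold | data)."""
    rng = random.Random(seed)
    # Conjugate update: posterior is Beta(a0 + alt, b0 + ref).
    a, b = a0 + alt_reads, b0 + depth - alt_reads
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(n_samples))
    return hits / n_samples

# A locus with 8 alternate reads out of 400 (2% observed fraction)
# yields strong posterior evidence that theta exceeds 1%, while a
# single alternate read out of 400 does not.
p_variant = beta_binomial_posterior_call(8, 400)
p_null = beta_binomial_posterior_call(1, 400)
```

The conjugacy makes the posterior exact; the sampling step only approximates the tail probability used for the call.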
Liver Fibrosis Surface Assessment Based on Non-Linear Optical Microscopy
Ph.D. (Doctor of Philosophy)
Predicting the amount of coke deposition on catalyst through image analysis and soft computing
The amount of coke deposition on catalyst pellets is one of the most important indexes of catalytic performance and service life. As a result, it is essential to measure it and to analyze the active state of the catalysts during a continuous production process. This paper proposes a new method to predict the amount of coke deposition on catalyst pellets based on image analysis and soft computing. An image acquisition system consisting of a flatbed scanner and an opaque cover is used to obtain catalyst images. After image processing and feature extraction, twelve effective features are selected and the two best feature sets are determined by prediction tests. A neural network optimized by a particle swarm optimization algorithm is used to establish the prediction model of the coke amount on various datasets. The root mean square errors of the predictions are all below 0.021 and the coefficients of determination (R²) for the model are all above 78.71%. Therefore, a feasible, effective and precise method is demonstrated, which may be applied to realize real-time measurement of coke deposition based on on-line sampling and fast image analysis.
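The PSO-plus-regression pipeline described above can be sketched in miniature. As an assumption for illustration, a linear model stands in for the paper's neural network and the "image features" are synthetic; only the particle swarm optimization step follows the standard textbook form (inertia plus cognitive and social pulls).

```python
import numpy as np

def pso_fit_linear(X, y, n_particles=30, iters=200, seed=0):
    """Toy particle-swarm optimisation of linear-model weights, a
    stand-in for the paper's PSO-tuned neural network. Each particle
    is a candidate weight vector (weights + bias) scored by MSE."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] + 1
    pos = rng.normal(size=(n_particles, dim))
    vel = np.zeros_like(pos)

    def mse(w):
        return float(np.mean((X @ w[:-1] + w[-1] - y) ** 2))

    pbest = pos.copy()
    pbest_err = np.array([mse(w) for w in pos])
    gbest = pbest[pbest_err.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, 1))
        # Inertia + pull toward personal best + pull toward global best.
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        err = np.array([mse(w) for w in pos])
        improved = err < pbest_err
        pbest[improved], pbest_err[improved] = pos[improved], err[improved]
        gbest = pbest[pbest_err.argmin()].copy()
    return gbest, mse(gbest)

# Hypothetical "image features" with a known linear relation to the
# coke amount; PSO should drive the training MSE close to zero.
rng = np.random.default_rng(1)
X = rng.random((50, 3))
y = X @ np.array([0.5, -0.2, 0.8]) + 0.1
w, err = pso_fit_linear(X, y)
```

On this convex stand-in problem the swarm converges quickly; the paper's non-convex neural-network loss is where PSO's gradient-free search actually earns its keep.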
Experimental investigation of the isothermal section at 400 °C of the MgCeSr ternary system
The objective of this study is to determine the isothermal section at 400 °C of the MgCeSr system. The constitution of the CeSr system and the MgCeSr system has been investigated over the entire composition range using X-ray diffraction (XRD), field-emission scanning electron microscopy (SEM) and energy-dispersive spectroscopy (EDS). No new binary compound has been found in the CeSr system, and no ternary compound has been found in the MgCeSr system either. Nine three-phase regions have been experimentally observed. Six binary phases, Mg2Sr, Mg23Sr6, Mg38Sr9, Mg17Sr2, Mg12Ce and Mg41Ce5, are found to dissolve about 3–7 at.% of the third element. This study provides the first experimental data on the CeSr binary system and determines the isothermal section at 400 °C of the MgCeSr ternary system.
Highly-Accurate Electricity Load Estimation via Knowledge Aggregation
Mid-term and long-term electric energy demand prediction is essential for the
planning and operation of the smart grid, especially in countries where the
power system operates in a deregulated environment. Traditional forecasting
models fail to incorporate external knowledge, while modern data-driven models
ignore interpretability; moreover, the load series is influenced by many
complex factors, making highly unstable and nonlinear power load series
difficult to handle. To address this forecasting problem, we propose a more
accurate district-level load prediction model based on domain knowledge and
the idea of decomposition and ensemble. Its main idea is three-fold: 1) given
the non-stationary character of the load time series, with obvious cyclicality
and periodicity, we decompose it into series with actual economic meaning and
then carry out load analysis and forecasting; 2) Kernel Principal Component
Analysis (KPCA) is applied to extract the principal components of the weather
and calendar-rule feature sets, reducing the data dimensionality; 3) drawing
on the strengths of various models informed by domain knowledge, we propose a
hybrid model (XASXG) based on the Autoregressive Integrated Moving Average
model (ARIMA), support vector regression (SVR) and the extreme gradient
boosting model (XGBoost). With these designs, the model accurately forecasts
electricity demand despite its highly unstable characteristics. We compared
our method with nine benchmark methods, including classical statistical models
as well as state-of-the-art machine-learning models, on real time series of
monthly electricity demand in four Chinese cities. The empirical study shows
that the proposed hybrid model is superior to all competitors in terms of
accuracy and prediction bias.
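Of the components above, the KPCA step is the most self-contained. The sketch below is a minimal RBF-kernel PCA on synthetic data (the kernel width `gamma` and the data are assumptions), showing the feature-space centring and eigendecomposition that reduce a feature set to a few principal components.

```python
import numpy as np

def rbf_kpca(X, n_components=2, gamma=0.5):
    """Minimal kernel PCA with an RBF kernel: build the Gram matrix,
    centre it in feature space, and project onto the leading
    eigenvectors scaled by the square roots of their eigenvalues."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]  # keep the largest
    vals, vecs = vals[idx], vecs[:, idx]
    # Rows are the projected coordinates of the training points.
    return vecs * np.sqrt(np.maximum(vals, 1e-12))

# Hypothetical stand-in for a weather/calendar feature matrix:
# 40 observations with 6 features, compressed to 3 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
Z = rbf_kpca(X, n_components=3)
```

Unlike linear PCA, the components live in the kernel-induced feature space, so nonlinear structure in the weather and calendar features can be captured with just a few dimensions.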
Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation
Accurate segmentation of topological tubular structures, such as blood
vessels and roads, is crucial in various fields, ensuring accuracy and
efficiency in downstream tasks. However, many factors complicate the task,
including thin local structures and variable global morphologies. In this work,
we note the specificity of tubular structures and use this knowledge to guide
our DSCNet to simultaneously enhance perception in three stages: feature
extraction, feature fusion, and loss constraint. First, we propose a dynamic
snake convolution to accurately capture the features of tubular structures by
adaptively focusing on slender and tortuous local structures. Subsequently, we
propose a multi-view feature fusion strategy to complement the attention to
features from multiple perspectives during feature fusion, ensuring the
retention of important information from different global morphologies. Finally,
a continuity constraint loss function, based on persistent homology, is
proposed to constrain the topological continuity of the segmentation better.
Experiments on 2D and 3D datasets show that our DSCNet provides better accuracy
and continuity on the tubular structure segmentation task compared with several
methods. Our code will be publicly available. Comment: Accepted by ICCV 2023.
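The persistent-homology continuity constraint can be hinted at in dimension 0, where the homology of a binary segmentation reduces to counting connected components. The toy counter below (not the paper's differentiable loss; the 4-connectivity choice and example masks are assumptions) shows the kind of quantity such a loss penalises: a vessel prediction broken into pieces has more components than a continuous one.

```python
import numpy as np

def connected_components(mask):
    """Count 4-connected foreground components of a 2D binary mask via
    iterative flood fill. Dimension-0 persistent homology of a binary
    segmentation reduces to this component count."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    h, w = mask.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                count += 1                     # new component found
                stack = [(i, j)]
                seen[i, j] = True
                while stack:
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            stack.append((ny, nx))
    return count

# A prediction broken by a gap has two components; the continuous
# ground truth has one, so a continuity term would penalise the gap.
broken = np.array([[1, 1, 0, 0, 1, 1]])
whole = np.array([[1, 1, 1, 1, 1, 1]])
gap_penalty = connected_components(broken) - connected_components(whole)
```

The actual loss in the paper works with persistence diagrams so the penalty is differentiable; this sketch only shows the topological quantity being constrained.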